low precision
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
- (3 more...)
Ex Uno Pluria: Insights on Ensembling in Low Precision Number Systems
While ensembling deep neural networks has shown promise in improving generalization performance, scaling current ensemble methods for large models remains challenging. Given that recent progress in deep learning is largely driven by scale, exemplified by the widespread adoption of large-scale neural network architectures, scalability emerges as an increasingly critical issue for machine learning algorithms in the era of large-scale models. In this work, we first showcase the potential of low precision ensembling, where ensemble members are derived from a single model within low precision number systems in a training-free manner. Our empirical analysis demonstrates the effectiveness of our proposed low precision ensembling method compared to existing ensemble approaches.
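As an illustration of the idea, the sketch below derives ensemble members from a single trained PyTorch model by stochastically rounding its weights onto a low precision grid, so that rounding noise rather than retraining supplies the diversity. This is a minimal sketch of our reading of the abstract, not the paper's method: the helper names (`stochastic_round`, `make_low_precision_ensemble`) and the fixed-point-style grid are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code): training-free low
# precision ensembling by stochastically rounding one model's weights.
import copy
import torch

def stochastic_round(w: torch.Tensor, scale: float) -> torch.Tensor:
    """Round w to multiples of `scale`, up or down at random with
    probability proportional to the remainder (stochastic rounding)."""
    scaled = w / scale
    floor = torch.floor(scaled)
    prob_up = scaled - floor          # in [0, 1)
    return (floor + torch.bernoulli(prob_up)) * scale

def make_low_precision_ensemble(model: torch.nn.Module,
                                n_members: int = 4,
                                scale: float = 2.0 ** -6):
    """Each member is the same model with independently rounded weights;
    the grid spacing `scale` is an illustrative low precision format."""
    members = []
    for _ in range(n_members):
        member = copy.deepcopy(model)
        with torch.no_grad():
            for p in member.parameters():
                p.copy_(stochastic_round(p, scale))
        members.append(member)
    return members

@torch.no_grad()
def ensemble_predict(members, x):
    # Average the members' softmax outputs (one common way to combine
    # an ensemble; the paper may combine members differently).
    probs = torch.stack([m(x).softmax(dim=-1) for m in members])
    return probs.mean(dim=0)
```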
Unsupervised Cycle Detection in Agentic Applications
George, Felix, Kumar, Harshit, Pathak, Divya, Ray, Kaustabha, Verma, Mudit, Moogi, Pratibha
Agentic applications powered by Large Language Models exhibit non-deterministic behaviors that can form hidden execution cycles, silently consuming resources without triggering explicit errors. Traditional observability platforms fail to detect these costly inefficiencies. We present an unsupervised cycle detection framework that combines structural and semantic analysis. Our approach first applies computationally efficient temporal call stack analysis to identify explicit loops and then leverages semantic similarity analysis to uncover subtle cycles characterized by redundant content generation. Evaluated on 1575 trajectories from a LangGraph-based stock market application, our hybrid approach achieves an F1 score of 0.72 (precision: 0.62, recall: 0.86), significantly outperforming individual structural (F1: 0.08) and semantic methods (F1: 0.28). While these results are encouraging, there remains substantial scope for improvement, and future work is needed to refine the approach and address its current limitations.
- Asia > India (0.06)
- North America > United States > District of Columbia > Washington (0.05)
- North America > United States > New York > New York County > New York City (0.04)
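The hybrid detector described in the entry above lends itself to a compact sketch: a cheap structural pass over the temporal sequence of agent/tool calls, followed by a semantic pass over the generated content. The code below is our reading of that pipeline, not the authors' implementation; `embed` stands in for any text-embedding model, and the repeat count and similarity threshold are illustrative.

```python
# Illustrative sketch (assumed) of the two detectors the abstract
# describes: structural loop detection plus semantic redundancy.
from typing import Callable, Sequence
import math

def structural_cycles(calls: Sequence[str], min_repeats: int = 3) -> bool:
    """Flag explicit loops: the same window of calls repeating
    back-to-back at least `min_repeats` times."""
    n = len(calls)
    for size in range(1, n // min_repeats + 1):
        for start in range(n - size * min_repeats + 1):
            window = calls[start:start + size]
            if all(calls[start + k * size:start + (k + 1) * size] == window
                   for k in range(min_repeats)):
                return True
    return False

def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def semantic_cycles(outputs: Sequence[str],
                    embed: Callable[[str], Sequence[float]],
                    threshold: float = 0.9) -> bool:
    """Flag subtle cycles: consecutive steps whose generated content is
    nearly identical under the embedding model."""
    vecs = [embed(o) for o in outputs]
    return any(cosine(vecs[i], vecs[i + 1]) >= threshold
               for i in range(len(vecs) - 1))

def detect_cycle(calls, outputs, embed) -> bool:
    # Hybrid: run the cheap structural check first, then the
    # semantic check for cycles the structural pass misses.
    return structural_cycles(calls) or semantic_cycles(outputs, embed)
```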
Mixed Precision Training of Neural ODEs
Celledoni, Elena, Owren, Brynjulf, Ruthotto, Lars, Yang, Tianjiao Nicole
Exploiting low-precision computations has become a standard strategy in deep learning to address the growing computational costs imposed by ever larger models and datasets. However, naively performing all computations in low precision can lead to roundoff errors and instabilities. Therefore, mixed precision training schemes usually store the weights in high precision and use low-precision computations only for whitelisted operations. Despite their success, these principles are currently not reliable for training continuous-time architectures such as neural ordinary differential equations (Neural ODEs). This paper presents a mixed precision training framework for neural ODEs, combining explicit ODE solvers with a custom backpropagation scheme, and demonstrates its effectiveness across a range of learning tasks. Our scheme uses low-precision computations for evaluating the velocity, parameterized by the neural network, and for storing intermediate states, while stability is provided by a custom dynamic adjoint scaling and by accumulating the solution and gradients in higher precision. These contributions address two key challenges in training neural ODEs: the computational cost of repeated network evaluations and the growth of memory requirements with the number of time steps or layers. Along with the paper, we publish our extendable, open-source PyTorch package rampde, whose syntax resembles that of leading packages so it can serve as a drop-in replacement in existing code. We demonstrate the reliability and effectiveness of our scheme using challenging test cases and on neural ODE applications in image classification and generative models, achieving approximately 50% memory reduction and up to 2x speedup while maintaining accuracy comparable to single-precision training.
- North America > United States > Tennessee > Knox County > Knoxville (0.14)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (4 more...)
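The core numerical pattern in the abstract above, low-precision network evaluations with high-precision accumulation of the solution, can be sketched with a toy forward Euler loop. This is a minimal illustration under our own assumptions, not rampde's API: `euler_mixed` is a hypothetical name, `f` is any callable `f(t, y)` returning the velocity, and the paper's custom adjoint and dynamic scaling are omitted.

```python
# Toy sketch (assumed): evaluate the velocity network in low precision,
# accumulate the ODE solution in float32.
import torch

def euler_mixed(f, y0: torch.Tensor, t0: float, t1: float,
                n_steps: int) -> torch.Tensor:
    h = (t1 - t0) / n_steps
    y = y0.float()                   # solution kept in float32
    for k in range(n_steps):
        t = t0 + k * h
        # The expensive network evaluation runs in low precision.
        with torch.autocast(device_type=y.device.type,
                            dtype=torch.bfloat16):
            v = f(t, y)
        y = y + h * v.float()        # cast up before accumulating
    return y
```

Accumulating in float32 keeps the per-step roundoff from compounding over many steps, which is why only the velocity evaluation, not the state update, is done in the cheap format.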
FGMP: Fine-Grained Mixed-Precision Weight and Activation Quantization for Hardware-Accelerated LLM Inference
Hooper, Coleman, Sakr, Charbel, Keller, Ben, Venkatesan, Rangharajan, Keutzer, Kurt, Shao, Sophia, Khailany, Brucek
Quantization is a powerful tool to improve large language model (LLM) inference efficiency by utilizing more energy-efficient low-precision datapaths and reducing memory footprint. However, quantizing LLM weights and activations to low precision without degrading model accuracy is challenging. We propose fine-grained mixed precision (FGMP) quantization, a post-training mixed-precision quantization hardware-software co-design methodology that maintains accuracy while quantizing the majority of weights and activations to reduced precision. Our work makes the following contributions: 1) We develop a policy that uses the perturbation in each value, weighted by the Fisher information, to select which weight and activation blocks to keep in higher precision, thereby minimizing the perturbation in the model loss and preserving accuracy. 2) We propose a sensitivity-weighted clipping approach for fine-grained quantization that helps retain accuracy for blocks quantized to low precision. 3) We propose hardware augmentations to leverage the efficiency benefits of FGMP quantization. Our hardware implementation encompasses i) datapath support for FGMP at block granularity, and ii) a mixed-precision activation quantization unit to assign activation blocks to high or low precision on the fly with minimal runtime and energy overhead. Our design, prototyped using NVFP4 (an FP4 format with microscaling) as the low-precision datatype and FP8 as the high-precision datatype, facilitates efficient FGMP quantization, attaining <1% perplexity degradation on Wikitext-103 for the Llama-2-7B model relative to an all-FP8 baseline design while consuming 14% less energy during inference and requiring 30% less weight memory.
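A rough sketch of the Fisher-weighted selection policy from contribution 1, under our own assumptions: `quantize_block` is a toy symmetric quantizer standing in for NVFP4, and the block size and keep ratio are illustrative parameters, not the paper's.

```python
# Hypothetical sketch: score each weight block by its quantization
# perturbation weighted by (diagonal) Fisher information, and keep the
# highest-scoring blocks in the high precision format.
import torch

def quantize_block(w: torch.Tensor, n_bits: int = 4) -> torch.Tensor:
    """Toy symmetric per-block quantizer (stand-in for NVFP4)."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-12) / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

def select_high_precision_blocks(weights: torch.Tensor,
                                 fisher: torch.Tensor,
                                 block_size: int = 16,
                                 keep_ratio: float = 0.1) -> torch.Tensor:
    """Boolean mask over blocks: True = keep in high precision.
    Assumes weights.numel() is divisible by block_size."""
    w = weights.reshape(-1, block_size)
    f = fisher.reshape(-1, block_size)
    err = w - torch.stack([quantize_block(b) for b in w])
    # Fisher-weighted squared perturbation per block.
    score = (f * err.pow(2)).sum(dim=1)
    k = max(1, int(keep_ratio * score.numel()))
    mask = torch.zeros_like(score, dtype=torch.bool)
    mask[score.topk(k).indices] = True
    return mask
```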
Mixed precision accumulation for neural network inference guided by componentwise forward error analysis
Arar, El-Mehdi El, Filip, Silviu-Ioan, Mary, Theo, Riccietti, Elisa
This work proposes a mathematically founded mixed precision accumulation strategy for the inference of neural networks. Our strategy is based on a new componentwise forward error analysis that explains the propagation of errors in the forward pass of neural networks. Specifically, our analysis shows that the error in each component of the output of a layer is proportional to the condition number of the inner product between the weights and the input, multiplied by the condition number of the activation function. These condition numbers can vary widely from one component to the other, thus creating a significant opportunity to introduce mixed precision: each component should be accumulated in a precision inversely proportional to the product of these condition numbers. We propose a practical algorithm that exploits this observation: it first computes all components in low precision, uses this output to estimate the condition numbers, and recomputes in higher precision only the components associated with large condition numbers. We test our algorithm on various networks and datasets and confirm experimentally that it can significantly improve the cost-accuracy tradeoff compared with uniform precision accumulation baselines. Keywords: neural network, inference, error analysis, mixed precision, multiply-accumulate.
- Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.24)
- Europe > France > Île-de-France > Paris > Paris (0.24)
- Europe > France > Brittany > Ille-et-Vilaine > Rennes (0.24)
- (3 more...)
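The algorithm in the abstract above maps naturally onto a short NumPy sketch: compute every output component in low precision, estimate each inner product's condition number from that cheap pass, and recompute only the ill-conditioned components in higher precision. The precision formats and threshold below are illustrative choices, not the paper's.

```python
# Sketch (assumed formats/threshold) of condition-number-guided
# mixed precision accumulation for one linear layer y = W @ x.
import numpy as np

def mixed_precision_layer(W: np.ndarray, x: np.ndarray,
                          cond_threshold: float = 1e2) -> np.ndarray:
    # 1) Full low precision pass (float16 as the cheap format).
    W16, x16 = W.astype(np.float16), x.astype(np.float16)
    y = (W16 @ x16).astype(np.float32)

    # 2) Condition number of each inner product w_i . x:
    #    kappa_i = sum_j |w_ij * x_j| / |sum_j w_ij * x_j|.
    numer = np.abs(W16.astype(np.float32) * x16.astype(np.float32)).sum(axis=1)
    kappa = numer / np.maximum(np.abs(y), 1e-30)

    # 3) Recompute only the ill-conditioned components in float64.
    bad = kappa > cond_threshold
    if bad.any():
        y[bad] = (W[bad].astype(np.float64)
                  @ x.astype(np.float64)).astype(np.float32)
    return y
```

A large kappa means heavy cancellation in the inner product, which is exactly where low precision accumulation loses the most accuracy, so those components are the ones worth redoing.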
GreenEye: Development of Real-Time Traffic Signal Recognition System for Visual Impairments
Recognizing a traffic signal, determining whether it is green or red, and judging the time left to cross the crosswalk are significant challenges for visually impaired people. Previous research has focused on recognizing only two traffic signal states, green and red lights, using machine learning techniques. We developed GreenEye, a system that recognizes the traffic signal's color and reports the time left for pedestrians to cross the crosswalk in real time. GreenEye's first training session achieved a highest per-class precision of 74.6%, with four classes at 40% or lower recognition precision. This low precision was caused by data imbalance, so additional labeling and database construction were performed to balance the number of images across classes. After this balancing, all 14 classes showed an excellent precision of 99.5%.
- Asia > South Korea > Ulsan > Ulsan (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Asia > North Korea (0.04)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
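The precision jump reported above came from balancing the per-class image counts. The paper did this with extra labeling and database construction; the sketch below substitutes a simpler stand-in, naive oversampling by duplication, just to illustrate the balancing step. All names here are hypothetical.

```python
# Toy illustration (not the paper's method) of balancing a dataset so
# every class has as many samples as the largest one.
import random
from collections import defaultdict

def balance_by_oversampling(samples):
    """samples: list of (image_path, class_label) pairs."""
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append(path)
    target = max(len(paths) for paths in by_class.values())
    balanced = []
    for label, paths in by_class.items():
        balanced.extend((p, label) for p in paths)
        # Duplicate random minority-class images up to the target count
        # (the paper instead labeled new images, which avoids the
        # overfitting risk that duplication carries).
        balanced.extend((random.choice(paths), label)
                        for _ in range(target - len(paths)))
    return balanced
```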